Dealing with Discrepancies in Wrapper Functionality
Abstract
Much of the world's information is stored electronically in data sources. The data sources can be full-fledged databases, simple files, HTML pages or specialized data sources that possess diverse query processing capabilities. The common architecture to integrate such sources consists of mediators that give a global view over the content of all sources, and wrappers that give a local view of each source. Answering queries in this architecture is a difficult problem due to the wide range of capabilities of data sources. This paper presents a solution to this problem in the context of the Disco query processor. We provide a tool to the wrapper implementor to describe the capabilities of the wrapper in fine detail. When a wrapper is registered with the mediator, the mediator uploads the capabilities of the wrapper, and smoothly integrates these capabilities into query processing. Our solution is novel both in the level of detail permitted by the tool and its easy incorporation into existing query optimization strategies. In this paper we describe: (a) the query processing of Disco, (b) the language for specifying wrapper capabilities, (c) the algorithms that integrate these capabilities into query processing, and (d) an implementation of these techniques in the Disco prototype.
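The registration step described in the abstract can be pictured with a minimal sketch. This is not Disco's capability language or algorithm: the names `Wrapper`, `Mediator`, `register` and `split_plan` are hypothetical, and capabilities are reduced to a flat set of operator names, whereas the paper's tool describes them in much finer detail. The sketch assumes a query plan is a linear pipeline of operators applied in order, and that the mediator pushes the longest supported prefix to the wrapper and evaluates the rest itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wrapper:
    """Hypothetical wrapper: a name plus the operators it can evaluate itself."""
    name: str
    capabilities: frozenset


class Mediator:
    """Hypothetical mediator that records wrapper capabilities at registration."""

    def __init__(self):
        self.registry = {}

    def register(self, wrapper):
        # At registration time the mediator stores what the wrapper can do.
        self.registry[wrapper.name] = wrapper.capabilities

    def split_plan(self, wrapper_name, pipeline):
        """Split a linear operator pipeline into (pushed_to_wrapper, run_at_mediator).

        Only the longest prefix of supported operators is pushed down, so the
        order of operations in the original pipeline is preserved.
        """
        caps = self.registry[wrapper_name]
        for i, op in enumerate(pipeline):
            if op not in caps:
                return list(pipeline[:i]), list(pipeline[i:])
        return list(pipeline), []


mediator = Mediator()
mediator.register(Wrapper("flat_file", frozenset({"scan"})))
mediator.register(Wrapper("sql_db", frozenset({"scan", "select", "project"})))

plan = ["scan", "select", "project"]
file_split = mediator.split_plan("flat_file", plan)
sql_split = mediator.split_plan("sql_db", plan)
```

With these assumed capabilities, the flat-file wrapper only scans and the mediator performs the selection and projection, while the SQL wrapper receives the whole pipeline.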
Similar resources
Intelligent Wrapping of Information Sources: Getting Ready for the Electronic Market
Literature search and delivery on the World Wide Web is becoming a rapidly expanding market. So far, searching has mostly been free of charge, but in the future we expect more and more providers to charge for their services. The main problems are finding the right provider and extracting the information. UniCats is a system for intelligent information search and extraction from multiple pr...
Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression
Nowadays, the growing volume of data and number of attributes in datasets reduces the accuracy of learning algorithms and increases their computational complexity. Feature selection is a dimensionality reduction method that is carried out through filtering and wrapping. Wrapper methods are more accurate than filter ones, whereas filter methods run faster and carry a lighter computational burden. With ...
SUPPORTING QUERY PROCESSING ACROSS APPLICATION SYSTEMS Aspects of Wrapper-Based Foreign Function Integration
With the emergence of so-called application systems which encapsulate databases and related application components, pure data integration using, for example, a federated database system is not possible anymore. Instead, access via predefined functions is the only way to get data from an application system. As a result, the combination of generic query as well as predefined function access is ne...
The Wargo System: Semi-Automatic Wrapper Generation in Presence of Complex Data Access Modes
Semi-automatic wrapper generation tools aim to ease the task of building structured views over web sources. But the wrapper generation techniques presented to date show several weaknesses when dealing with today's complex commercial web sources, especially when constructing advanced navigational sequences for accessing data. We present Wargo, a semi-automatic wrapper generation tool, whi...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...